MojoStorePagedSingleCache for single K/V paged store #372
Code Review
This pull request introduces the MojoStorePagedSingleCache operator and its backend implementation IxformerStorePagedSingleCache, allowing the storage of a single attribute (key or value) into a single paged cache. It also includes comprehensive unit tests to verify accuracy and alignment with the full KV store. The review feedback suggests adding an explicit dimension check for the cache tensor to prevent potential indexing errors, as well as a device compatibility check between states and cache to avoid runtime failures.
    Returns:
        torch.Tensor: Updated ``cache`` after in-place writes.
    """
    assert len(states.shape) == 3, "states must be (token_num, kv_head_num, head_dim), please check."
The cache tensor is expected to have exactly 4 dimensions (total_phys_blocks, kv_heads, block_size, head_dim). Adding an explicit dimension check for cache (similar to the check for states) will prevent unexpected IndexError when accessing cache.shape[2] and provide a clearer error message.
Suggested change:
-   assert len(states.shape) == 3, "states must be (token_num, kv_head_num, head_dim), please check."
+   assert len(states.shape) == 3, "states must be (token_num, kv_head_num, head_dim), please check."
+   assert len(cache.shape) == 4, "cache must be (total_phys_blocks, kv_heads, block_size, head_dim), please check."
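A minimal sketch of the rank check suggested above, written against any object that exposes `.shape` so it runs without a GPU or the ixformer backend. The helper name `check_paged_store_shapes` and the `SimpleNamespace` stand-ins are assumptions for illustration and are not part of this PR.

```python
from types import SimpleNamespace


def check_paged_store_shapes(states, cache):
    """Validate tensor ranks before any shape indexing happens.

    Hypothetical helper: accepts any object with a ``.shape`` attribute,
    such as a torch.Tensor.
    """
    if len(states.shape) != 3:
        raise ValueError(
            "states must be (token_num, kv_head_num, head_dim), "
            f"got shape {tuple(states.shape)}"
        )
    if len(cache.shape) != 4:
        raise ValueError(
            "cache must be (total_phys_blocks, kv_heads, block_size, head_dim), "
            f"got shape {tuple(cache.shape)}"
        )
    # Rank 4 is now guaranteed, so indexing shape[2] cannot raise IndexError.
    return cache.shape[2]  # block_size


# Stand-ins for tensors; real code would pass torch.Tensor objects.
states = SimpleNamespace(shape=(8, 2, 64))
cache = SimpleNamespace(shape=(16, 2, 32, 64))
block_size = check_paged_store_shapes(states, cache)
```

The sketch raises `ValueError` rather than using `assert` because assertions are stripped when Python runs with `-O`; whether that trade-off matters here is a judgment call for the PR author.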
    if cache.dtype != states.dtype:
        raise ValueError("IxformerStorePagedSingleCache requires states and cache to have the same dtype.")
It is highly recommended to verify that states and cache are on the same device. A device mismatch between these tensors will cause runtime failures or silent errors during execution.
Suggested change:
-   if cache.dtype != states.dtype:
-       raise ValueError("IxformerStorePagedSingleCache requires states and cache to have the same dtype.")
+   if cache.dtype != states.dtype:
+       raise ValueError("IxformerStorePagedSingleCache requires states and cache to have the same dtype.")
+   if cache.device != states.device:
+       raise ValueError("IxformerStorePagedSingleCache requires states and cache to be on the same device.")
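The combined dtype and device guard could be factored out roughly as follows. This is a sketch only: `check_same_dtype_and_device` is a hypothetical name, and the string-valued `dtype` and `device` fields stand in for `torch.dtype` and `torch.device`, which support the same `!=` comparison.

```python
from types import SimpleNamespace


def check_same_dtype_and_device(states, cache):
    """Fail fast, before any kernel launch, if the two tensors are incompatible.

    Hypothetical helper: accepts any objects with ``.dtype`` and ``.device``.
    """
    if cache.dtype != states.dtype:
        raise ValueError(
            f"states and cache must share a dtype, got {states.dtype} and {cache.dtype}"
        )
    if cache.device != states.device:
        raise ValueError(
            f"states and cache must be on the same device, got {states.device} and {cache.device}"
        )


# Stand-ins for tensors; strings replace torch.dtype / torch.device here.
states = SimpleNamespace(dtype="float16", device="cuda:0")
cache_ok = SimpleNamespace(dtype="float16", device="cuda:0")
cache_wrong_device = SimpleNamespace(dtype="float16", device="cpu")

check_same_dtype_and_device(states, cache_ok)  # passes silently
```

Including both offending values in the message makes a mismatch easier to diagnose than the fixed strings in the suggestion, at the cost of slightly longer error text.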
Claude Code Review
Verdict: Request changes -- The Ixformer single-cache backend passes the same tensor as both K and V to the underlying paged store, which will corrupt the cache for any mode that actually writes both.
Summary
Adds a new …
Must fix
Suggestions (3)
Nits (2)
Notes
Claude Code Review
Verdict: Request changes -- The ixformer single-cache backend ignores the …
Summary
Adds a new …
Must fix
Suggestions (3)
Notes
MojoStorePagedSingleCache, a single-tensor variant of MojoStorePagedKVCache that writes only one attribute (key OR value) into a paged cache. Supports both block_table and chunk_metadata paths, with an ixformer backend and accuracy tests.
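To make the block_table path concrete, here is a pure-Python sketch of writing one attribute into a paged cache. It assumes the layouts quoted in the review (states as token_num x kv_heads x head_dim, cache as total_phys_blocks x kv_heads x block_size x head_dim); the function name, its arguments `seq_id` and `start_pos`, and the use of nested lists are illustrative assumptions, not the operator's actual signature. The chunk_metadata path is not covered.

```python
def store_paged_single_cache(states, cache, block_table, seq_id, start_pos):
    """Write per-token states of one sequence into a paged cache, in place.

    Hypothetical reference routine, not the operator from this PR.
    states:      [token_num][kv_heads][head_dim]
    cache:       [total_phys_blocks][kv_heads][block_size][head_dim]
    block_table: [seq][logical_block] -> physical block id
    """
    block_size = len(cache[0][0])
    for t, token in enumerate(states):
        pos = start_pos + t  # position of this token in the sequence
        phys_block = block_table[seq_id][pos // block_size]
        offset = pos % block_size  # slot inside the physical block
        for h, head_vec in enumerate(token):
            cache[phys_block][h][offset] = list(head_vec)
    return cache


# Tiny example: 4 physical blocks, 1 KV head, block_size 2, head_dim 2.
cache = [[[[0.0, 0.0] for _ in range(2)] for _ in range(1)] for _ in range(4)]
block_table = [[3, 1]]  # sequence 0: logical block 0 -> phys 3, 1 -> phys 1
states = [[[1.0, 1.0]], [[2.0, 2.0]], [[3.0, 3.0]]]  # three tokens

store_paged_single_cache(states, cache, block_table, seq_id=0, start_pos=0)
```

In this example the first two tokens fill physical block 3 and the third spills into slot 0 of physical block 1, which is the indirection that a single-tensor store has to reproduce for whichever of key or value it is given.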